Abstract. One of the major challenges of ECoG-based Brain-Machine
Interfaces is predicting the movement of a human subject. Several
methods exist to predict a 2-D arm trajectory. The fourth BCI Competition
provides a dataset in which the aim is to predict individual finger
movements (a 5-D trajectory). The difficulty lies in the fact that there is
no simple relation between ECoG signals and finger movement. In this
paper, we propose to decode finger flexions using switching models. This
method simplifies the system, which is then described as an ensemble of
linear models depending on an internal state. We show that such a model
achieves a promising prediction accuracy.

1 Introduction 

Some people who suffer from neurological diseases can be highly paralyzed
because they no longer have control over their muscles. Their only way to
communicate is then through their electroencephalogram signals.
Brain-Computer Interface (BCI) research aims at developing systems that
help such disabled people communicate with machines. Non-invasive BCIs
have recently received a lot of interest because of the simple protocol for
placing sensors on the scalp surface [1,2]. Furthermore, although the
electroencephalogram signals are recorded through the skull, these BCIs
have shown great performance capabilities and can be used by real
Amyotrophic Lateral Sclerosis (ALS) patients [3,4].

However, non-invasive recordings still show some drawbacks, including a
poor signal-to-noise ratio and a poor spatial resolution. Hence, in order to
overcome these difficulties, invasive BCIs may be used. For instance,
Electrocorticographic (ECoG) recordings have recently received a great
amount of interest owing to their semi-invasive nature, as they are recorded
from the cortical surface. They offer a higher spatial resolution and are far
less sensitive to artifact noise. The feasibility of invasive BCIs has been
demonstrated in several recent papers [5,6,7,8]. In many of these papers,
the BCI paradigm considered is motor imagery, thus yielding a
binary-decision BCI.

A recent breakthrough has been made by Schalk et al. [9], who have shown
that ECoG recordings can lead to multiple-degree BCI control. Together with
the subsequent work of Pistohl et al. [10], these two works have considered
the problem of predicting arm movements from ECoG signals. Both approaches
are based on estimating a linear relation between features extracted from
ECoG signals and the actual arm movement.

In this work, we investigate a finer degree of resolution in BCI control by
addressing the problem of estimating finger flexions from ECoG signals. We
propose in this paper a method for decoding finger movements from ECoG
data based on switching models. The underlying idea of switching models is
the hypothesis that the movements of each of the five fingers are triggered
by an internal discrete state that can be estimated, and that all finger
movements depend on that internal state. While such an idea of switching
models has already been successfully used for arm movement prediction on
monkeys from micro-electrode array measurements [11], we develop a specific
approach adapted to finger movements. The global method has been tested on
the 4th dataset of the BCI Competition IV [12].

The paper is organized as follows. First, we present the dataset from the
BCI Competition IV used in this paper. Then, we explain the decoding method
used to obtain finger flexions from ECoG signals. Finally, we present the
results obtained with our method and discuss several ways of improving them.

2 Dataset 

For this work, the fourth dataset from the BCI Competition IV [12] was used.
The subjects were 3 epileptic patients who had platinum electrode grids
placed on the surface of their brain. The number of electrodes varies
between 48 and 64 depending on the subject, and their position on the cortex
was unknown.

Electrocorticographic (ECoG) signals of the subjects were recorded at a
1 kHz sampling rate using BCI2000 [13]. A band-pass filter from 0.15 to
200 Hz was applied to the ECoG signals. The finger flexion of the subject
was recorded at 25 Hz and up-sampled to 1 kHz. Due to the acquisition
process, a delay appears between the finger movement and the measured ECoG
signal. To correct this time-lag, we apply the 37 ms delay proposed in the
dataset description [12] to the ECoG signals.

The BCI Competition dataset consists of a 10-minute recording per subject.
6 minutes 40 seconds (400,000 samples) were provided for learning the models
and the remaining 3 minutes 20 seconds (200,000 samples) were used for
testing. However, since the finger flexion signals have been up-sampled and
are thus partly composed of artificial samples, we have down-sampled the
number of points by a factor of 4, leading to a training set of size 100,000
and a testing set of size 50,000. The 100,000 samples provided for learning
have been split into a training set (75,000) and a validation set (25,000).
Then, all parameters of the approach have been optimized in order to
maximize the performance on the validation set. Note that all results
presented in the paper have been obtained using the testing set provided by
the competition (the estimates being up-sampled back by a factor of 4).
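The data preparation described above can be illustrated with a minimal sketch. It assumes that down-sampling is a plain decimation (keeping every fourth sample) and that the validation set is the last part of the learning data; the function name `downsample_and_split` and the toy array shapes are hypothetical and only serve the illustration.

```python
import numpy as np

def downsample_and_split(ecog, flexion, factor=4, n_train=75_000):
    """Keep every `factor`-th sample, then split chronologically.

    Assumption: plain decimation; the text does not specify the exact
    down-sampling scheme.
    """
    ecog_ds = ecog[::factor]
    flex_ds = flexion[::factor]
    train = (ecog_ds[:n_train], flex_ds[:n_train])
    valid = (ecog_ds[n_train:], flex_ds[n_train:])
    return train, valid

# Toy arrays with the learning-set length of the competition (400,000
# samples); only 2 channels are used here to keep the example small.
ecog = np.zeros((400_000, 2))
flexion = np.zeros((400_000, 5))
(train_x, train_y), (valid_x, valid_y) = downsample_and_split(ecog, flexion)
```

With these sizes, the first 75,000 down-sampled points form the training set and the remaining 25,000 the validation set.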



In this competition, method performance was measured through the
cross-correlation between the measured and the estimated finger flexion. The
correlations were averaged across fingers and across subjects to obtain the
overall method performance. Note that the fourth finger was not used for
evaluation in the competition since its movements were shown to be
correlated with those of the other fingers [12].
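As an illustration of this performance measure, here is a small sketch. It assumes that the cross-correlation is the Pearson correlation coefficient between the measured and estimated signals of each finger; `competition_score` is a hypothetical helper name, not part of the competition material.

```python
import numpy as np

def competition_score(y_true, y_pred, skip_finger=3):
    """Mean Pearson correlation over fingers for one subject.

    y_true, y_pred: arrays of shape (n_samples, 5).
    skip_finger: 0-based index of the finger left out (the fourth finger).
    """
    scores = [np.corrcoef(y_true[:, j], y_pred[:, j])[0, 1]
              for j in range(y_true.shape[1]) if j != skip_finger]
    return float(np.mean(scores))

# Toy check: an affine transform of the target is perfectly correlated with it.
rng = np.random.default_rng(0)
y_true = rng.standard_normal((1000, 5))
y_pred = 2.0 * y_true + 1.0
score = competition_score(y_true, y_pred)
```

The overall performance would then be the mean of the per-subject scores.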

3 Finger flexion decoding using switching linear models 

This section presents the full methodology we have used for addressing the
problem of estimating finger flexions from ECoG signals. In the first part,
we give an overview of the switching models. Then, we describe how we learn
the function that estimates which finger is about to move. Afterwards, we
detail the linear models associated with each moving finger. Finally, we
briefly describe how the complete method works in the decoding stage.

3.1 Overview 

In order to obtain an efficient prediction of finger flexions, we have made
the hypothesis that for such movements the brain can be understood as a
switching model. This translates into the assumption that the measured ECoG
signals and the finger movements are intrinsically related by an internal
state $k$. In our case, this state corresponds to each finger moving, from
$k = 1$ for the thumb to $k = 5$ for the little finger, or $k = 6$ for no
finger movement. Here, we used mutually exclusive states because the
experimental set-up considered specifies that only one or no finger is
moving. Figure 1 gives the picture of our finger movement decoding scheme.
Basically, the idea is that, based on some features extracted from the ECoG
signals, the internal hidden state triggering the switching finger models
can be estimated. Then, this state allows the system to select an
appropriate model $H_k(x)$ for estimating all finger flexions, with $x$
being a feature vector.

For the complete model, we need to estimate the function $f(\cdot)$ that
maps the ECoG features to an internal state $k \in \{1, \dots, 6\}$ and the
functions $H_k(\cdot)$ that relate the brain signals to all finger flexion
amplitudes. The next paragraphs present how we have modeled these functions
and how we have estimated them from the data.

3.2 Moving finger estimation 

The methodology used for learning the function $f(\cdot)$, which estimates
the moving finger, is described below.

Fig. 1: Diagram of our switching models decoder. From the ECoG signals, we
estimate two models: (bottom flow) one which outputs a state $k$ predicting
which finger is moving and (top flow) another one that, given the predicted
moving finger, estimates the flexion of all fingers.

Fig. 2: Diagram of the feature extraction procedure for the moving finger
decoding. Here, we have outlined the processing of a single channel signal.

Feature extraction. For this problem of estimating the moving finger, the
features we used are based on smoothed Auto-Regressive (AR) coefficients of
the signal. The global overview of the feature extraction procedure is given
in Figure 2. For a single channel, the procedure is the following. The
signal from that channel is divided into non-overlapping windows of 300
samples. For each window, an auto-regressive model has been estimated. Thus,
AR coefficients are obtained every 300 samples (denoted by the vertical
dashed lines and the crosses in Figure 2). In order to have continuous AR
coefficient values, a smoothing spline-based interpolation between
consecutive AR coefficients has been used. Note that instead of
interpolating, we could have computed the AR coefficients at each time
instant. However, the approach we propose here has the double advantage of
being less computationally demanding and of providing smoothed (and thus
more robust to noise) AR coefficients. Finally, only the first two AR
coefficients are used as features. Signal dynamics have been taken into
account by applying a similar procedure to shifted versions of the signal
(at $+t_s$ and $-t_s$). Hence, for measurements involving 48 channels, the
feature vector at a time instant $t$ is obtained by concatenating the
features extracted from all channels, leading to a resulting vector of size
48 x 3 x 2 = 288.
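The single-channel procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the AR model is fitted by ordinary least squares, each coefficient is attached to the center of its window, and a linear interpolation (`np.interp`) stands in for the smoothing spline of the text. The function names are hypothetical, and the shifted signal versions are not covered.

```python
import numpy as np

def ar_coefficients(window, order=2):
    """Least-squares fit of x[n] = a_1 x[n-1] + ... + a_p x[n-p] + e[n]."""
    n = len(window)
    # Column i holds the signal delayed by (i + 1) samples.
    design = np.column_stack(
        [window[order - i - 1:n - i - 1] for i in range(order)])
    target = window[order:]
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs

def smoothed_ar_features(signal, win=300, order=2):
    """AR coefficients per non-overlapping window, interpolated to every sample."""
    n_win = len(signal) // win
    coeffs = np.array([ar_coefficients(signal[w * win:(w + 1) * win], order)
                       for w in range(n_win)])            # (n_win, order)
    centers = np.arange(n_win) * win + win / 2.0           # assumed anchor points
    t = np.arange(len(signal))
    # Assumption: linear interpolation replaces the smoothing spline.
    return np.column_stack([np.interp(t, centers, coeffs[:, i])
                            for i in range(order)])        # (n_samples, order)

# Toy AR(2) process used as a stand-in for one ECoG channel.
rng = np.random.default_rng(0)
x = np.zeros(3000)
for n in range(2, len(x)):
    x[n] = 0.5 * x[n - 1] - 0.3 * x[n - 2] + rng.standard_normal()
features = smoothed_ar_features(x)
```

Fitting one model per window and interpolating requires only one least-squares problem every 300 samples, which reflects the computational argument made in the text.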

Channel selection. In practice, we do not consider all the channels in the
model. A channel selection algorithm has been run in order to reduce the
number of channels. For this channel selection procedure, the feature vector
$x_t$ at time $t$ has been computed as described above, except that we have
not considered the shifted signal versions and have used only the first AR
coefficient.

Then, for each finger, based on the training set, we estimated a linear
regression $y = x^\top c_k$, where $x$ is a feature vector with one entry
per channel and $y \in \{1,-1\}$ states whether the considered finger is
moving or not. Once we have estimated the coefficient vector $c_k$ for each
finger, we selected the $K$ channels that present the largest values of:

$$\sum_{k=1}^{6} |c_k|$$

where the absolute value is taken element-wise. This channel selection
allows us to substantially reduce the number of channels, which minimizes
the computational effort needed for estimating and evaluating the function
$f(\cdot)$, and it yields better performance. $K$ has been chosen so that
the cross-correlation on the validation set is maximal.
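The selection rule can be sketched as follows, assuming that each regression is an ordinary least-squares fit (the text does not state whether a regularization was used). The function name `select_channels` and the toy data are hypothetical.

```python
import numpy as np

def select_channels(X, Y, n_keep):
    """Rank channels by the summed absolute regression coefficients.

    X: (n_samples, n_channels), one feature per channel.
    Y: (n_samples, n_states), entries in {+1, -1}.
    Returns the sorted indices of the n_keep most relevant channels.
    """
    C, *_ = np.linalg.lstsq(X, Y, rcond=None)   # column k holds c_k
    relevance = np.abs(C).sum(axis=1)           # element-wise |c_k|, summed over k
    return np.sort(np.argsort(relevance)[::-1][:n_keep])

# Toy data: only channels 2 and 5 carry information about the two states.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 8))
Y = np.sign(X[:, [2, 5]])
selected = select_channels(X, Y, n_keep=2)
```

In the paper, the number of kept channels is tuned on the validation set; here `n_keep` is simply fixed for the illustration.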

Model estimation. The model for estimating which finger is moving is a more
sophisticated version of the one used above for channel selection. First,
since the finger movements are mutually exclusive, we have considered a
winner-takes-all strategy:

$$f(x) = \arg\max_{k=1,\dots,6} f_k(x) \qquad (1)$$

Here again, $f_k(x)$ is a linear model that is trained by presenting pairs
of a feature vector and a state $y \in \{1,-1\}$. The main differences
between the channel selection procedure and the one used for learning
$f_k(\cdot)$ are that the features here take into account some dynamics of
the ECoG signals and that a finer feature selection has been performed by
means of a simultaneous sparse approximation method.

Let us consider the $n$ training examples $\{x_t, y_t\}_{t=1}^{n}$, where
$x_t \in \mathbb{R}^d$ and $y_{t,k} \in \{1,-1\}$ is the $k$-th entry of
vector $y_t$, $t$ denoting the time instant and $k$ denoting all possible
states (including no finger moving). $y_{t,k}$ tells us whether the finger
$k$ is moving at time $t$. Now, let us define the matrices $Y$, $X$ and $C$
as:

$$[Y]_{t,k} = y_{t,k}, \qquad [X]_{t,j} = x_{t,j}, \qquad [C]_{j,k} = c_{j,k}$$

where $x_{t,j}$ and $c_{j,k}$ are the $j$-th components of respectively
$x_t$ and $c_k$. The aim of simultaneous sparse approximation is to learn
the coefficient matrix $C$ while yielding the same sparsity profile in the
different finger models. The task boils down to the following optimization
problem:

$$\hat{C} = \arg\min_{C} \|Y - XC\|_F^2 + \lambda_s \sum_{i} \|C_{i,\cdot}\|_2 \qquad (2)$$



Fig. 3: Workflow of the extraction of the learning sets ($X_k$ and $Y_k$)
and estimation of the linear models $H_k$.



where $\lambda_s$ is a trade-off parameter that has to be appropriately
tuned and $C_{i,\cdot}$ is the $i$-th row of $C$. Note that our penalty term
is a mixed $\ell_1$-$\ell_2$ norm similar to those used for group-lasso.
Owing to the $\ell_1$ penalty on the $\ell_2$ row-norms, such a penalty
tends to induce a row-sparse matrix $C$. Problem (2) has been solved using
the block-coordinate descent algorithm proposed by Rakotomamonjy [14].
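To make the effect of the mixed-norm penalty concrete, here is a minimal sketch that minimizes the objective of Problem (2). It deliberately uses a different and simpler solver than the paper: a proximal gradient iteration with row-wise soft-thresholding instead of the block-coordinate descent of [14]. The function name, the iteration count and the toy data are assumptions made for the illustration.

```python
import numpy as np

def row_sparse_regression(X, Y, lam, n_iter=500):
    """Minimize ||Y - X C||_F^2 + lam * sum_i ||C[i, :]||_2.

    Swapped-in solver: proximal gradient, not the algorithm of [14].
    """
    C = np.zeros((X.shape[1], Y.shape[1]))
    # Step size 1/L, with L = 2 * sigma_max(X)^2 the gradient Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ C - Y)
        Z = C - step * grad
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        # Row-wise soft-thresholding: rows with a small norm become exactly zero.
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        C = shrink * Z
    return C

# Toy problem: only features 1 and 4 are used by the three output models.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
C_true = np.zeros((10, 3))
C_true[1] = [1.0, -2.0, 0.5]
C_true[4] = [-1.5, 0.5, 2.0]
Y = X @ C_true + 0.01 * rng.standard_normal((200, 3))
C_hat = row_sparse_regression(X, Y, lam=20.0)
support = np.flatnonzero(np.linalg.norm(C_hat, axis=1) > 0)
```

Because a whole row of $C$ is either kept or zeroed, a feature is selected or discarded jointly for all states, which is the shared sparsity profile mentioned above.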

3.3 Learning finger flexion models 

Here, we discuss the model relating the ECoG data and finger movements for
every possible value of $k$. In other words, supposing that a given finger,
say the index, is going to move (as predicted by our moving finger
estimation), we build an estimation of all finger movements. Hence, for each
$k$, we learn a linear model $g_{k,j}(x) = x^\top h_j^{(k)}$ with
$j = 1, \dots, 5$, $x$ a feature vector and $h_j^{(k)}$ a weighting vector
indexed by the moving finger $k$ and the finger $j$ whose flexion is to be
estimated. We have chosen linear models since they have been shown to
provide good performance for decoding movements from ECoG [10,9].

At a time $t$, the feature vector $\tilde{x}_t$ has been obtained along the
same lines as Pistohl et al. [10], that is, we use filtered time samples as
features. $\tilde{x}_t$ has been built in the following way. All channels
have been filtered with a Savitzky-Golay (third order, 0.4 s width) low-pass
filter. Then, $\tilde{x}_t$ is composed of the concatenation of the time
samples at $t$, $t - \tau$ and $t + \tau$ for all smoothed signals at all
channels. Samples at $t - \tau$ and $t + \tau$ have been used in order to
take into account some temporal delays between the brain activity and the
finger movements.
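The concatenation step can be sketched as follows. The sketch assumes that the channels have already been smoothed (the Savitzky-Golay filtering is not reproduced here) and that $\tau$ is expressed as a whole number of samples; `lagged_features` is a hypothetical name.

```python
import numpy as np

def lagged_features(smoothed, tau):
    """Concatenate the samples at t, t - tau and t + tau for every channel.

    smoothed: (n_samples, n_channels), already low-pass filtered.
    Returns an array of shape (n_samples - 2 * tau, 3 * n_channels) whose
    first row corresponds to t = tau.
    """
    n = smoothed.shape[0]
    now = smoothed[tau:n - tau]
    past = smoothed[:n - 2 * tau]
    future = smoothed[2 * tau:]
    return np.hstack([now, past, future])

# Toy input: 10 samples of 2 channels, with a lag of 2 samples.
smoothed = np.arange(20.0).reshape(10, 2)
feats = lagged_features(smoothed, tau=2)
```

The first and last `tau` samples are dropped because their past or future neighbors are not available.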

Now, let us detail how, for a given moving finger $k$, the weight matrix
$H_k = [h_1^{(k)} \cdots h_5^{(k)}]$ has been learned. For a given finger
$k$, we have used as a training set all samples where that finger is known
to be moving. For this purpose, we have manually segmented the signals and
extracted the appropriate signal segments needed for building the target
matrix $Y_k$, which contains all finger positions, and for extracting the
feature matrix $X_k$. This training sample extraction stage is illustrated
in Figure 3. Then, for learning the global linear model, we have solved the
following multi-dimensional ridge regression problem:

$$\min_{H_k} \|Y_k - X_k H_k\|_F^2 + \lambda_k \|H_k\|_F^2 \qquad (3)$$

with $\lambda_k$ being a regularization parameter that has to be tuned.
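Problem (3) has a closed-form solution, obtained by setting the gradient to zero: $(X_k^\top X_k + \lambda_k I) H_k = X_k^\top Y_k$. A minimal sketch, with a hypothetical function name and random toy data:

```python
import numpy as np

def fit_ridge(X_k, Y_k, lam):
    """Closed-form minimizer of ||Y_k - X_k H||_F^2 + lam * ||H||_F^2."""
    d = X_k.shape[1]
    return np.linalg.solve(X_k.T @ X_k + lam * np.eye(d), X_k.T @ Y_k)

# Toy data: 100 samples, 6 features, 5 finger flexions.
rng = np.random.default_rng(0)
X_k = rng.standard_normal((100, 6))
Y_k = rng.standard_normal((100, 5))
H_k = fit_ridge(X_k, Y_k, lam=1.0)
```

All five columns of $H_k$ are obtained with a single linear solve, since they share the same feature matrix and regularization parameter.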

For this problem of finger movement estimation, we also noted that feature
selection helps in improving performance. Again, we have used the
coefficients of the estimated weighting matrix $H_k$ for pruning the model.
We have kept in the model the $M$ features which correspond to the $M$
largest entries of the vector $\sum_{j=1}^{5} |h_j^{(k)}|$. For each $k$ and
each subject, $M$ is chosen so as to minimize a validation error. Note that
such an approach for pruning the model can be interpreted as a shrinkage of
the least-squares parameters.
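The pruning step can be sketched as follows. Two points are assumptions rather than statements of the paper: the discarded features are simply zeroed in $H_k$ without refitting the remaining ones, and the validation error is taken to be the mean squared error. Both function names are hypothetical.

```python
import numpy as np

def prune_model(H_k, n_keep):
    """Zero every row of H_k except the n_keep rows with largest sum_j |h_j^(k)|."""
    relevance = np.abs(H_k).sum(axis=1)
    keep = np.argsort(relevance)[::-1][:n_keep]
    pruned = np.zeros_like(H_k)
    pruned[keep] = H_k[keep]
    return pruned

def choose_n_keep(H_k, X_val, Y_val, candidates):
    """Pick the candidate M with the smallest validation mean squared error."""
    errors = [np.mean((Y_val - X_val @ prune_model(H_k, m)) ** 2)
              for m in candidates]
    return candidates[int(np.argmin(errors))]

# Toy model with 3 features and 2 outputs; feature 1 has small weights.
H_k = np.array([[3.0, 0.0], [0.1, 0.1], [0.0, 2.0]])
pruned = prune_model(H_k, n_keep=2)

rng = np.random.default_rng(0)
X_val = rng.standard_normal((50, 3))
Y_val = X_val @ pruned                     # data generated by the pruned model
best_m = choose_n_keep(H_k, X_val, Y_val, candidates=[1, 2, 3])
```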

3.4 Decoding finger movement 

When all models have been learned, the decoding scheme is the one given in
Figure 1. Given the two feature vectors $x_t$ (moving finger estimation) and
$\tilde{x}_t$ (flexion models) at a time $t$, the finger position estimation
is obtained as:

$$\hat{y} = \tilde{x}_t^\top H_{\hat{k}}, \quad \text{with} \quad \hat{k} = \arg\max_{k} x_t^\top c_k \qquad (4)$$

with $\hat{y}$ being a row vector containing the estimated finger movements,
$\hat{k}$ the finger that is supposed to move, $\tilde{x}_t$ the extracted
feature at time $t$ and $H_{\hat{k}}$ the estimated linear model for state
$\hat{k}$.
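Equation (4) amounts to one matrix-vector product per model followed by a lookup. The sketch below assumes that the six state models are stored as the columns of a matrix `C` and that one flexion matrix is available per state, including the no-movement state; states are numbered from 0 in the code, and all names are hypothetical.

```python
import numpy as np

def decode(x_t, x_tilde_t, C, H):
    """One decoding step of the switching model.

    x_t: (d,) features for the moving finger estimation.
    x_tilde_t: (m,) features for the flexion models.
    C: (d, 6) matrix whose column k is c_k.
    H: sequence of 6 matrices of shape (m, 5), one per state.
    """
    k_hat = int(np.argmax(x_t @ C))       # winner-takes-all over the states
    y_hat = x_tilde_t @ H[k_hat]          # flexion estimate for the 5 fingers
    return k_hat, y_hat

# Toy models: state k is detected by feature k, and H[k] is filled with k.
C = np.eye(6)
H = [k * np.ones((3, 5)) for k in range(6)]
k_hat, y_hat = decode(np.eye(6)[2], np.ones(3), C, H)
```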



4 Results 

In this section, we present the performance of our switching model decoder.
First, we explain how all parameters of the models have been set. Then, we
present some results which help us understand the contribution of the
different parts of our models. Finally, we evaluate our approach and compare
it to the BCI competition results.

Table 1: Number of samples used in the validation step for subject 1

Finger   Sub. 1   Sub. 2   Sub. 3   Avg.
1        0.4191   0.5554   0.7128   0.5625
2        0.4321   0.4644   0.6541   0.5169
3        0.6162   0.3723   0.2492   0.4126
4        0.4091   0.5668   0.0781   0.3513
5        0.4215   0.5165   0.5116   0.4832
Avg.     0.4596   0.4951   0.4411   0.4653

Table 2: Correlation coefficients obtained by the linear models $h_k^{(k)}$.



4.1 Parameter selection 

The parameters used in the moving finger estimation are selected by a
validation method on the last part of the training set (75,000 samples for
training, 25,000 for validation). We suppose that the size of the set is
large enough to avoid over-fitting. Using this method, we select the number
of selected channels, the time-lag $t_s$ used in feature extraction and the
regularization term $\lambda_s$ of Eq. (2).

Similarly, all parameters ($\tau$, the number of selected channels and
$\lambda_k$) needed for estimating $H_k$ have been set so that they optimize
the model performance on the validation sets. For this model selection part,
the sizes of the training and validation sets vary according to $k$; they
are summarized in Table 1 for subject 1.

4.2 Evaluation of the linear models

Models $H_k$ correspond to the linear regressions between the ECoG features
and the finger flexions when the $k$-th finger is moving. The signals used
for evaluating these models are extracted in the same manner as the learning
sets $Y_k$ and $X_k$ (see Figure 4), but by assuming that the true
segmentation of finger movements is known. To evaluate these models, we
measure the cross-correlation between the true and estimated finger flexion
$X_k h_k^{(k)}$ only when the finger $k$ is moving. The correlations are
reported in Table 2.

We observe that by using a linear regression between the ECoG signals and
the finger flexions, we achieve a correlation of 0.46 (averaged across
fingers and subjects). These results are comparable to those obtained for
arm trajectory prediction (Schalk [9] obtained 0.5 and Pistohl [10] obtained
0.43).



Fig. 4: Signal extraction for linear model estimation: (upper plot, measured
and estimated finger flexion with exact sequence, finger 1, subject 1) full
signal with the segments corresponding to the moving finger bracketed by
vertical lines and (lower plot, extracted signal for evaluation of the
linear model, corr=0.41, finger 1, subject 1, finger 1 moving) the extracted
signal corresponding to the concatenation of the samples when finger 1 is
moving.



4.3 Evaluating the switching decoder method 

In order to evaluate the efficiency of the switching model decoder and the
contribution of each block of the decoder, we report three different
results. First, for a given finger, we compute the estimated finger flexion
using a linear model learned on all samples (including those where the
considered finger is not moving). Then, we decode finger flexions with our
switching decoder while assuming that the exact sequence of hidden states is
known¹. Finally, we use our switching decoder with the estimated hidden
states.

As a baseline for comparison with our switching models decoder, we have
estimated the finger flexions by means of a single linear model which has
been trained using all the time samples. The obtained correlations are given
in Table 3a and the regression result can be seen in the upper plots of
Figure 5. We can see that the correlations obtained are rather low, due to
the fact that without switching models the amplitude of the estimated
flexion signals remains small.

The switching model decoder is a two-part process, as it requires the linear
models $H_k$ and the sequence of hidden states. First, we apply the decoder
using the true sequence obtained from the actual finger flexion, that is, we
suppose that we have the exact sequence $k$ and we apply the switching
decoder with this sequence. We know that these results may never be attained
in practice, as they suppose the sequence labeling method to be perfect, but
they give an interesting idea of the maximal performance that our method can
provide for given linear models $H_k$. Results can be seen in the middle
plots of Figure 5 and the correlations are in Table 3b. We obtain a high
accuracy across all subjects, with an average correlation of 0.61 when using
an exact sequence. This shows that the switching model can be efficiently
used for decoding ECoG signals. Note that by using switching linear models,
we include a switching mean, which induces a high correlation.



¹ This is possible since the finger movements on the test set are now available.



Fig. 5: True and estimated finger flexion for (upper plots) a global linear
regression, (middle plots) the switching decoder with the true moving finger
segmentation and (lower plots) the switching decoder with an estimated
moving finger segmentation. (a) Subject 1, Finger 1: corr=0.18 with a unique
regression, corr=0.80 with switching models and exact sequence. (b) Subject
1, Finger 2: corr=0.74 with switching models and exact sequence, corr=0.61
with switching models and estimated sequence.



Finally, we use our global method for obtaining the finger movement
estimation. In other words, we used the switching models $H_k$ to decode the
signals with Equation (4) and the estimated sequence $\hat{k}$. The finger
movement estimation can be seen in the lower plots of Figure 5 and the
correlation measures are in Table 3c. As expected, the accuracy is lower
than the one obtained with the true segmentation. However, we obtained an
average correlation of 0.42, which is far better than when using a global
regression approach. These predictions of the finger flexions were submitted
to the BCI Competition and achieved the second place. Note that the last 3
fingers have the lowest correlations. These fingers are physically highly
correlated and are much more difficult to discriminate than the first two.
The first finger is by far the best estimated one, with a correlation
averaged across subjects of 0.56.



4.4 Discussion and future work

The results presented in the previous section correspond to the method used
for the BCI Competition.

We first note that the best performance, obtained by Liang et al. [15], is a
correlation of about 0.46. Their method considers an amplitude modulation
along time to cope with the abrupt changes in the amplitude of the finger
flexions. Such an approach is somewhat similar to ours, since they try to
distinguish situations where fingers are moving or not.

Then, we believe that our approach can be improved in several ways. 

Indeed, we chose to use linear models depending on the internal states, but
[10] proposed to use a Kalman filter for the decoding of movement. This
approach may be extended to using switching Kalman filters in the switching
model decoder.

Table 3: Correlation between measured and estimated movement for (a) a
global linear regression, (b) the switching decoder with the exact sequence
and (c) the switching decoder with an estimated sequence.

Furthermore, our approach for estimating the sequence of hidden states can
be greatly improved. Liang [15] proposed to use Power Spectral Densities of
the ECoG channels as features, and we believe that these features may be
added and used in the sequence labeling. In our method, the features are
low-pass filtered in order to increase recognition performance, but other
sequence labeling methods like HMM [11] have been used in BCI. Other
sequence labeling methods like Conditional Random Fields [16], known to
outperform HMM in some cases, or Sequence SVM may be used to get a better
sequence of hidden states.

Another interesting approach that may be investigated is the mixture of
sources approach. Indeed, one may consider that each moving finger is
associated with a source of ECoG signals. Then, the problem of identifying
which finger is moving may boil down to a source separation problem.

5 Conclusions 

In this paper, we present a method for decoding finger flexions from ECoG
signals. The decoder is based on switching linear models. Our approach has
been tested on the BCI Competition IV Dataset 4 and achieved the second
place in the competition. Results show that the switching model approach
produces better results than using a unique model. Furthermore, an accurate
finger flexion estimation may be achieved when using an exact sequence of
hidden states, which shows the interest of the switching models.

In future work, we plan to improve the results of the switching models
decoder along two different directions. On the one hand, we can use more
general models than linear ones for the movement prediction (switching
Kalman filters, non-linear regression). On the other hand, we can improve
the sequence labeling along time with new approaches and by using new
features extracted from the ECoG signals.